Abstract. This chapter explores the correlation between centering and different forms of pronominal
reference in Italian, in particular zeros and overt pronouns in subject position. In previous work
(Di Eugenio, 1990), I proposed that such alternation could be explained in terms of centering transitions.
In this chapter I verify those hypotheses by means of a small corpus of naturally occurring data. In the
process, I extend my previous analysis in several ways, for example by taking possessives into account; I
also provide a more detailed analysis of CONTINUE: more specifically, I show that pronouns are used in a
markedly different way in CONTINUE's preceded by another CONTINUE or by a SHIFT, and in CONTINUE's
preceded by a RETAIN.

1 Introduction

Italian is a pro-drop language, in that the subject of a clause need not be overt. Thus, an Italian speaker has 
a variety of choices in realizing a subject, including using a null subject or an overt pronoun. In previous 
work (Di Eugenio, 1990), I proposed that the alternation of null and overt pronominal subjects could be 
explained in terms of centering transitions. However, the hypotheses I put forward in my earlier work were 
supported only by a few constructed examples. In this chapter, I report on a corpus study that I conducted 
in order to find more solid evidence for those hypotheses. Analyzing real data had the added benefit of 
bringing me to address issues still problematic for centering, such as how possessives and subordinates affect 
the ordering on the Cf list, and to provide a more detailed analysis of CONTINUE's.

The version of centering that I use is basically the one described in the overview by Walker, Prince and 
Joshi (this volume). However, the ordering for the Cf list I use is modified with respect to the usual one 
postulated for Western languages: 

(1) SUBJECT > OBJECT2 > OBJECT > OTHERS 



Various researchers (Kameyama, 1985; Walker, Iida, and Cote, 1994) had pointed out that in languages
such as Japanese, empathy and topic marking affect Cf ordering. Turan (1995) argues that the notion
of empathy is relevant to Western languages as well, because of non-agentive psychological verbs, such as
interest, seem; perception verbs, such as feel, appear; and in general, expressions that refer to a character's
point of view, such as The thought crossed her mind.¹ Turan points out that, with such expressions, it
is the experiencer, which is often in object position, rather than the grammatical subject, that should be
ranked higher. Moreover, Turan points out that in her Turkish corpus quantified indefinite subjects (QIS)
and arbitrary plural pro's (PRO_ARB) rank very low. Therefore, the Cf ranking in (1) should be amended as
follows (Turan, 1995):²

(2) EMPATHY > SUBJECT > OBJECT2 > OBJECT > OTHERS > QIS, PRO_ARB

In this chapter, I will adopt (2).
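As a minimal sketch of how the ranking in (2) orders a Cf list, the following Python fragment sorts the entities of one utterance by role. The role labels, the function name order_cf and the role assignment in the usage line are illustrative assumptions, not an annotation scheme taken from this chapter or from Turan's work.

```python
# Ranking (2), highest rank first. The role labels are hypothetical names
# chosen for this sketch.
CF_RANKING = ["empathy", "subject", "object2", "object", "others", "qis_pro_arb"]


def order_cf(entities):
    """Order (entity, role) pairs by ranking (2).

    Python's sort is stable, so entities with the same role keep their
    surface order.
    """
    rank = {role: position for position, role in enumerate(CF_RANKING)}
    ordered = sorted(entities, key=lambda pair: rank[pair[1]])
    return [entity for entity, _role in ordered]


# "The thought crossed her mind": the experiencer is assumed to carry the
# empathy role, so it outranks the grammatical subject.
print(order_cf([("the thought", "subject"), ("her", "empathy")]))
```

With ranking (1) instead, the same call would place the grammatical subject first; only the contents of CF_RANKING would change.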



¹For a thorough treatment of subjective expressions and tracking characters' points of view see (Wiebe, 1994).
²The first researcher to propose that empathy should precede subject in the Cf ranking was (Kameyama, 1985).

Another difference between the standard notion of centering and the way I applied it is that I don't
explicitly take discourse segment boundaries into account. According to (Grosz, Joshi, and Weinstein,
1995), centering is a local mechanism that applies within a single specific discourse segment, but not across
segment boundaries. However, segmenting discourse is an active research area in itself, and there are no
texts with agreed upon discourse structures. It seems that, when analyzing naturally occurring text, two
approaches are possible:

• The first is to postulate a segment structure for the text of interest, for example exploiting paragraph
boundaries, cue words etc. (Walker, 1989).



• The second is to disregard segment boundaries and apply centering between every two adjacent
utterances. It is in fact possible that the absence of a centering transition between two utterances indicates
a segment boundary; see Passonneau (this volume), Walker (this volume). This is the approach I
adopted, as my interest is in using centering as an explanatory tool for the distribution of pronominal
forms. Notice that the cues that (Walker, 1989) uses to provide a discourse structure prior to applying
centering include whether the first sentence of a new paragraph

has a pronoun in subject position or a pronoun where none of the preceding sentence-internal
noun phrases match its syntactic features.

In these cases, Walker doesn't consider the paragraph as constituting a discourse segment separate
from the preceding one.



Finally, note that in this paper, as in my previous work, I take the same position advocated in (Walker,
Iida, and Cote, 1994)

that the interpretation of zeros is an inferential process, but that syntactic information provides
constraints on this inferential process.

I will suggest that it is the syntactic context up to and including the verbal complex that affects the
interpretation of null subjects.³

The chapter is organized as follows. In Sec. 2, I discuss the Italian pronominal system and the hypotheses
from my previous work. Sec. 3 describes the corpus I used and details the distributions of various referring
expressions in subject position. In Sec. 4, I first discuss assumptions and extensions I had to make in order
to apply centering to naturally occurring text, and then report on the correlations between pronouns and
centering transitions. Sec. 5 analyzes such correlations: in particular, I refine the notion of CONTINUE in
order to account for a non-negligible number of occurrences of strong pronouns. Finally, Sec. 6 presents
conclusions and future work.



2 Italian pronouns and Centering 
2.1 The Italian pronominal system 

In Italian, there are two pronominal systems, characterized by a different syntactic distribution: weak
pronouns, which must always be cliticized to the verb (la, lo, li, le, gli: respectively her, accusative; him,
accusative; them, masculine accusative; them, feminine accusative or her, dative; him, dative), and strong
pronouns (lui, lei, loro: respectively he or him; she or her; they or them).⁴ The null subject is considered
part of the system of weak pronouns.

³I'm using here the term inferential processing, and later terms such as strategies, in their intuitive sense. Susan Brennan
(p.c.) brought to my attention the difference between strategic or inferential processing and automatic processing, and the fact
that syntactic cues of the kind I discuss in this paper affect the latter, not the former.

⁴Traditionally, lui, lei, loro were the oblique forms of the strong system, with the nominative forms being respectively egli,
ella, essi/e: however, in current Italian the latter forms are only rarely used, as the oblique forms have replaced them in subject
position; among the instances of strong pronouns in Table 3 there are only three occurrences of egli, and all three of them
occur in (Pagetti, 1993).

In Italian there is no neuter gender: nouns referring to inanimate objects are masculine or feminine. The
weak pronouns used in this case are those of the corresponding gender. However, strong pronouns can't refer
to inanimate objects, so that paraphrases or deictics are used: a strong pronoun for inanimate objects does
exist (esso / essi for masculine, singular and plural, essa / esse for feminine, singular and plural) but it
is not much used in current Italian: there is only one instance of esse in my corpus, in (Pagetti, 1993).⁵



Weak and strong pronouns are often in complementary distribution, as strong pronouns have to be used
in prepositional phrases, e.g. per lui, for him, as in Ex. (3b). However, this syntactic alternation doesn't
apply in subject position: the choice of null versus strong pronoun depends on pragmatic factors, and can
be accounted for in terms of centering transitions.

(3a) Maria_i e' andata in vacanza con suo padre_j:
     Maria_i is gone on holiday with her father_j:

(3b) e' stato un vero piacere per *lo_j/lui_j.
     (it) is been a real treat for him_j.

2.2 Previous results 



In (Di Eugenio, 1990), I proposed that

(4a) Typically, a null subject signals a CONTINUE, and a strong pronoun a RETAIN or a SHIFT.

(4b) A null subject can be felicitously used in cases of RETAIN or SHIFT if U_i provides syntactic features
that force the null subject to refer to a referent different from Cb(U_{i-1}). Moreover, it is the syntactic
context up to and including the verbal form(s) carrying tense and / or agreement that makes the
reference felicitous or not.

These claims stemmed from (constructed) examples such as the following, where I use referents of different 
gender — Maria, female proper name; Giovanni, Giorgio, male proper names — to show how gender and 
morphological markings come into play when resolving reference. These examples are not ambiguous in 
English, given that a null subject is not an option available to a speaker. Boldface is just meant to highlight 
referential expressions, not to indicate stress; pronouns in parentheses in English correspond to zeros⁶ in
Italian; also remember that lui, gli, lo are masculine, and lei, le, la feminine. 

(5a) Maria_i voleva andare al mare.
     Maria_i wanted to go to the seaside.

(5b) φ_i Telefono' a Giovanni_j.
     (She_i) phoned to Giovanni_j.

(5c) i. φ_i Si arrabbio' perche' φ_i non lo_j trovo' a casa.
        (She_i) self got angry because (she_i) not him_j found at home.

     ii. φ_i/?φ_j Si arrabbio' perche' φ_j stava dormendo.
         (She_i)/(?He_j) self got angry because (he_j) was sleeping.

     iii. Lui_j si arrabbio' perche' φ_j stava dormendo.
          He_j self got angry because (he_j) was sleeping.

     iv. φ_j Si e' arrabbiato_masc perche' φ_j stava dormendo.
         (He_j) self is become angry_masc because (he_j) was sleeping.

Consider the four (5c) variations; notice that Maria is both Cb(5b) and Cp(5b):

⁵This is the same article in which the three occurrences of egli also appear, see fn. 4.

⁶I will occasionally use the term zero: the reader should keep in mind that Italian allows only subjects to be dropped, at least as
a general rule.

(5c).i The null subject refers to Maria, which is then both Cb(5c.i) and Cp(5c.i). (5c).i thus realizes a
CONTINUE.

(5c).ii The most natural interpretation is that the null subject in the main clause refers to Maria; the null
subject in the subordinate clause is forced to refer to Giovanni on pragmatic grounds.⁷ For this same
pragmatic reason, the null subject in the main clause may be interpreted as referring to Giovanni, but
the discourse sounds less coherent.

(5c).iii As Giovanni was neither Cb(5b), nor Cp(5b), S performs a felicitous SMOOTH-SHIFT by referring to
Giovanni with a strong pronoun.

(5c).iv Contrast this utterance with (5c).ii. They should have the same effect on the hearer, namely, the
null subject should be interpreted as referring to Maria: instead in (5c).iv it is felicitously interpreted
as referring to Giovanni. This is due to the fact that in (5c).iv the verb is present perfect:⁸ the past
participle agrees with the subject, and its masculine morphology forces the null subject to refer to
Giovanni, and not to Maria. This last alternation lends support to my claim that it is the context up to
and including the verb that is taken into account when interpreting a zero: it is the fact that the main
verb is marked for masculine that allows the null subject to refer to something different from Cb(5b).

Further evidence for the importance of clues up to and including the verb comes from clitics, more
specifically, from clitics embedded in a modal or control verb construction, as Exs. (6c).i through iii show.
The crucial feature is that clitics may be cliticized to the infinitival complement of the higher verb, as in
(6c).i and (6c).ii, or can climb in front of the higher verb, as in (6c).iii.

(6a) φ_k ho parlato con Maria_i ieri.
     (I_k) have talked with Maria_i yesterday.

(6b) φ_i E' arrabbiata_fem con Giorgio_j:
     (She_i) is angry_fem with Giorgio_j:

(6c) i. φ_i non vuole piu' parlargli_j.
        (she_i) not wants any more to talk to him_j.

     ii. # φ_j non vuole piu' parlarle_i.
         # (he_j) not wants any more to talk to her_i.

     iii. φ_j non le_i vuole piu' parlare.
          (he_j) not to-her_i wants any more to talk.

(6c).i realizes a CONTINUE, with the null subject referring to Cp(6b) = Cb(6b), namely Maria.

(6c).ii is incoherent. The preferred interpretation for the null subject is Maria; however, when the clitic le
is found at the end of the sentence, the hearer is forced to change the interpretation of the null subject
to Giorgio. The effect is similar to a syntactic "garden path".

(6c).iii is acceptable, as the clitic le, which in (6c).ii is cliticized onto parlare, climbs in front of the modal
verb vuole: so the hearer is forced to exclude Maria as referent of the null subject. This happens early
enough so that no "garden path" effect is registered.



⁷In (Di Eugenio, 1990) I was not addressing the issue of interpreting null subjects in subordinate clauses.

⁸There is a grammatical temporal incoherence between (5c).iv and the preceding discourse, as the former is in the present
perfect, while the latter is in the simple past. However, this temporal incoherence does not affect resolution of pronoun reference,
as we can change the tenses in (5a) and (5b) to make the whole discourse temporally coherent, and the same kind of pronominal
reference occurs. The coherent discourse is: Maria_i vuole andare al mare. φ_i Ha telefonato a Giovanni_j. φ_j Si e' arrabbiato
perche' φ_j stava dormendo, which translates to Maria_i wants to go to the seaside. (She_i) has phoned to Giovanni_j. (He_j)
self is become angry_masc because (he_j) was sleeping. The verb volere, to want, in the first clause cannot be used in the present
perfect in this context.

It is from the contrast between (5c).ii and (5c).iv, and between the three (6c) variations that my claim about
the importance of the context up to and including the verbal complex stems. Admittedly, what exactly the
context up to and including the verbal complex amounts to is not clear, as it includes agreement features,
such as in (5c).iv, but not the clitic which hasn't climbed, as in (6c).i and (6c).ii: it apparently includes all
the verbal forms carrying tense and agreement features, which explains why the past participle, marked for
gender and number, is included, while the infinitival complement of a modal or control verb is not. The
observation that the context up to and including the verbal complex affects the interpretation of the subject
also makes sense from a lexical semantics point of view, given that the lexical semantics of the verb affects
pronoun interpretation.

As we will see in Sec. 5.2, I found only a few examples of such configurations in the corpus: while they
support the hypothesis in (4b), more data is required to come to definitive conclusions. Moreover, it is clear
that psycholinguistic experiments are needed to determine, among other things: whether cases such as (5c).iv, a
SMOOTH-SHIFT, require more time to process than cases such as (5c).i, a CONTINUE, even if both involve a
null subject; whether a SMOOTH-SHIFT overtly marked by a strong pronoun such as in (5c).iii requires less
processing time than one encoded by a zero and supported by syntactic features, such as in (5c).iv; whether
indeed effects analogous to syntactic garden paths occur in cases such as (6c).ii.



3 Corpus 

The corpus amounts to about 12,000 words (roughly 25 pages of text). It is composed of excerpts from two
books (von Arnim, 1989; Fallaci, 1989), a letter (Mila, 1993), a posting on the Italian electronic bulletin
board (SCI, 1994), a short story (Nichetti, 1993), and three articles from two newspapers (del Buono, 1993;
Pagetti, 1993; La Nazione, 1994). The excerpts are of different lengths, with the excerpts from the two
books being the longest, (von Arnim, 1989) with 3,641 words⁹ and (Fallaci, 1989) with 1,918, and the
posting on the Italian bulletin board, with only 603 words, the shortest. Texts were chosen to cover a variety
of contemporary written Italian prose, from formal (newspaper articles about politics and literature), to
informal (posting on the Italian electronic bulletin board).

The corpus I'm reporting about is a subset of the initial materials I assembled. In fact, I had to choose
prose that describes situations involving several animate referents, as strong pronouns can refer only to
animate referents.¹⁰ Moreover, I eliminated texts that contain direct speech, another thorny issue for centering
and in general for theories of discourse processing; excerpts from the two books don't contain dialogues.
(von Arnim, 1989) is a book written in the form of a diary, which explains the presence of first person pronouns;
the diary format has the advantage that there are almost no dialogues, which instead appear in the usual
novel format. The excerpts from (von Arnim, 1989) involve at least two people, and possibly at least two
are of the same sex; the chosen excerpts are discontinuous because, first, the diary format is in itself
discontinuous; second, as the book revolves around the author's interest in gardening, many pages discuss plants
and flowers or describe the landscape, and they obviously don't provide the required animate referents. The
excerpt from (Fallaci, 1989) was chosen because the situation described involves four people, two men and
two women.



3.1 Quantitative data 

Tables 1 through 4 provide the quantitative data for each text.

Table 1 gives the total number of grammatical subjects, and out of these, the total number of animate
subjects: I counted subjects in both main and subordinate tensed clauses, but I excluded impersonal
constructions and relative clauses where the relative pronoun is the subject.



⁹Most of my examples will come from (von Arnim, 1989) as it is the largest source of pronominal expressions: from now on
assume that the source of an example is (von Arnim, 1989), unless otherwise noted.

¹⁰On the contrary, null subjects can refer to inanimate subjects as well and even be used for discourse deixis, i.e. to refer to
a whole preceding discourse segment (Di Eugenio, 1989).

Text                  Total subjects   Animate subjects
(von Arnim, 1989)          241              229
(Fallaci, 1989)             27               24
(Mila, 1993)                63               50
(SCI, 1994)                 45               32
(Nichetti, 1993)            77               60
(del Buono, 1993)           73               42
(Pagetti, 1993)             39               23
(La Nazione, 1994)          65               37
Total                      630              497

Table 1: Total and animate subjects



Table 2 partitions animate subjects according to grammatical person. I didn't distinguish between
singular and plural pronouns, as no phenomenon I will talk about seems to be affected by such distinction. 
About 90% of referential expressions are singular, as there are 60 plural subjects out of 630 total subjects. 



Text                  Total   1st   2nd   3rd
(von Arnim, 1989)      229     73     -   156
(Fallaci, 1989)         24      -     -    24
(Mila, 1993)            50     23    18     9
(SCI, 1994)             32      9     1    22
(Nichetti, 1993)        60      -     -    60
(del Buono, 1993)       42      -     -    42
(Pagetti, 1993)         23      -     -    23
(La Nazione, 1994)      37      -     -    37
Total                  497    105    19   373

Table 2: First, second and third person animate subjects



Table 3 shows third person subjects partitioned into four classes: full NPs (this category also covers
NPs that include a possessive adjective referring to an animate entity, which I will discuss below); strong
pronouns; null subjects; and other anaphors, such as uno, one_masc, or tutte, all_fem.

Looking at Table 3, it is apparent that the percentage of full NPs versus pronouns is not constant through
the eight texts. The percentages vary from 11% in (Mila, 1993),¹¹ to between 20% and 30% in (von Arnim,
1989), (Fallaci, 1989), and (SCI, 1994): 21%, 25% and 27% respectively. Then there is an increase for the
last four texts, from 43% in (Nichetti, 1993), to 66% in (del Buono, 1993), 72% in (La Nazione, 1994) and
finally 82% in (Pagetti, 1993). Intuitively, it makes sense that more formal prose employs longer and more
elaborate constructions.

It is clear that a full analysis should include full NPs as well, as about 60% of the full NPs in Table 3 are
used referentially: for example, (Turan, 1995, ch. 6) discusses some intriguing results regarding the referential
usage of full NPs in subject position in Turkish. Turan notices that, in her Turkish corpus, ROUGH-SHIFT's
are realized 99% of the time by means of full NPs, and never by means of a null subject; this is consistent
with the absence of ROUGH-SHIFT's in my pronominal data, see below. For SMOOTH-SHIFT's, the picture is



¹¹(Mila, 1993) is probably a case in itself as it is a personal letter, and so it employs many more first and second person
pronouns than third person ones; see Table 2.

Text                  Total   Full NPs   Strong   Null   Other
(von Arnim, 1989)      156       45        23      81      7
(Fallaci, 1989)         24        6         2      16      -
(Mila, 1993)             9        1         2       5      1
(SCI, 1994)             22        7         -      11      4
(Nichetti, 1993)        60       26         1      33      -
(del Buono, 1993)       42       28         1      12      1
(Pagetti, 1993)         23       19         3       1      -
(La Nazione, 1994)      37       27         1       7      2
Total                  373      159        33     166     15

Table 3: Distribution of 3rd person subjects



Text                  strong   null
(von Arnim, 1989)       23      36
(Fallaci, 1989)          2       9
(Mila, 1993)             2       4
(SCI, 1994)              -       7
(Nichetti, 1993)         1      13
(del Buono, 1993)        1       6
(Pagetti, 1993)          3       -
(La Nazione, 1994)       1       5
Total                   33      80

Table 4: Strong pronouns and null subjects

more complicated: Turan notices that the shift to the object of the previous utterance is performed by means
of a full NP if the object is inanimate, of an overt pronoun if the object is animate. I have started analyzing
full NPs and their relation to centering: while I won't discuss full NPs in this chapter, some preliminary
results can be found in (Di Eugenio, 1996).

Finally, Table 4 shows just the data of interest, namely strong pronouns and null subjects. Notice that the
null subjects in Table 4 amount to about half of those appearing in Table 3: to analyze centering transitions,
I only considered those null subjects whose antecedents are not determined by contraindexing constraints
(Lasnik, 1976; Chomsky, 1981). I also excluded those that appear in a conjoined main clause which is not
the first conjunct, and such that the null subject corefers with the subject of the preceding conjunct:

(7) Lui_i non sembra mai demoralizzato, e φ_i va avanti ...
    He_i not appears ever frustrated, and φ_i carries on ...

In this case I consider the null subject to be constrained as if by contraindexing. Conjunctions do impose
syntactic constraints that are different from those derived from simply juxtaposing clauses,¹² as shown for
example by the fact that this is one of the rare contexts in which subject pronouns are sometimes dropped
even in English.



4 Subject pronouns and centering transitions 
4.1 Applying centering to real text 

When analyzing real text, one realizes that many issues are still open. I will comment here on how deictics, 
possessives, and subordinate clauses affect centering. 



Deictics. In texts such as (von Arnim, 1989), (Mila, 1993) or (SCI, 1994) there is an abundance of first and
second person pronouns, most of them singular (see Table 2). The problem is whether situational deictics
such as I and you are part of the Cf list or not; moreover, I in (von Arnim, 1989) often appears with verbs
of thought, so that the problem of how to deal with situational deictics compounds with the problem of how
to take subordinates into account. Consider the following example, where, in the utterance preceding (8a),
Cb = Cp = pastore (pastor_masc), and lui in (8b) refers to pastore:

(8a) Mi_i e' venuto spesso di pensare che cosa terribile sarebbe
     To me_i is come often to think what thing terrible would be

(8b) se lui_j si sentisse male nella sua bussola
     if he_j self_j felt bad in the his small room.

The issue is whether I belongs to the Cf list of (8a), or of (8a) and (8b) taken together, if a complement
clause such as (8b) is not an independent centering unit. I follow (Walker, 1993) in assuming that deictics
are always available as part of global focus, and therefore are outside the purview of centering.

Possessives. As noted above, the full NP category in Table 3 includes NPs that include a possessive
adjective referring to an animate entity, such as i suoi sforzi (his efforts). Possessives occur frequently
(they constitute about one fifth of the full NPs that perform centering transitions) and provide another
means of keeping the center of attention.

The problem is deciding how possessives affect Cb computation and the order on the Cf list. An NP
of type possessive in fact refers to two entities, the possessor Por and the possessed Ped: in the following
example, Por = Irais_i and Ped = husband_k; in the previous sentence, Cb = Irais_i, Cf = [Irais_i > English
gentleman_j]:



¹²Contra (Kameyama, 199c)

(9a) Suo_i marito_k non ha piu' avuto pace,
     Her_i husband_k not has any longer had peace,

(9b) e ogni volta che φ_i deve uscire da una stanza ...
     and every time that (she_i) has to leave from a room ...

While Cb computation does not appear to be affected by a possessive, which behaves like a pronoun, the Cf
ranking needs to be modified. Ped corresponds to the full NP, and thus its position in Cf is determined by
the NP's grammatical function; as regards Por, my working heuristic is to rank it as immediately preceding
Ped if Ped is inanimate, as immediately following Ped if Ped is animate. Consider the following (contrived)
discourse:



(10a) I met Mary_i yesterday.
(10b) She_i was worried.

(10c) i. Her_i husband_j was in the hospital.
      ii. Her_i car_k wasn't working.

In both (10c).i and (10c).ii the Cb is Mary_i; as regards the Cf list, in (10c).i it is [husband_j (Ped) >
Mary_i (Por)], while in (10c).ii it is [Mary_i (Por) > car_k (Ped)]. Clearly this heuristic needs to be
rigorously tested.
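The working heuristic for possessive NPs can be sketched in a few lines of Python. This is only an illustration of the ordering rule stated above: the function name and the boolean animacy flag are assumptions of the sketch, and the animacy judgment itself is taken as given rather than computed.

```python
def rank_possessive_np(possessor, possessed, possessed_is_animate):
    """Return the two entities of one possessive NP in Cf order.

    Heuristic from the text: Por immediately follows Ped if Ped is animate,
    and immediately precedes Ped if Ped is inanimate.
    """
    if possessed_is_animate:
        return [possessed, possessor]
    return [possessor, possessed]


# (10c).i "Her husband was in the hospital."
print(rank_possessive_np("Mary", "husband", possessed_is_animate=True))
# (10c).ii "Her car wasn't working."
print(rank_possessive_np("Mary", "car", possessed_is_animate=False))
```

In a full Cf list the returned pair would be spliced in at the position that the NP's grammatical function assigns to Ped.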



Subordinates. Another important issue, which has not been extensively addressed yet (but see (Kameyama,
1997), (Suri and McCoy, 1993)), is how to deal with complex sentences that include coordinates and
subordinates. The questions that arise concern whether there are independent Cb's and Cf lists for every clause;
if not, how the Cb of the complex sentence is computed, and how semantic entities appearing in different
clauses are ordered on the global Cf list.

A simple example is the following discourse, for which I provide a literal, but not word by word,
translation; for the utterance preceding (11a), we have Cb(U_{i-1}) = vicina_j (neighbor_fem), Cf(U_{i-1}) = [vicina_j].

(11a) Prima che i pigroni_i siano seduti a tavola a far colazione,
      Before the lazy ones_i sit down to have breakfast,

(11b) lei_j e' via col suo_j calessino alle altre cascine della tenuta.
      she_j has left with her_j buggy for the other farmhouses on the property.

The issue is whether the Cb and Cf list are updated after the whole sentence, or whether a new Cb and Cf
list are computed after (11a): these new items would then be the input to a new computation of Cb and Cf
list after (11b). It is my impression that preposed adjuncts, such as (11a), do affect centering transitions: the
fact that an overt subject is used after a preposed adjunct seems to support the view that this is a SHIFT or
a CENT-EST (see below), and not a simple CONTINUE from the previous utterance that could be encoded
with a null subject, which on the contrary is not particularly felicitous here. As a working hypothesis,
I've loosely adopted Kameyama's proposal (1993; 1997) that sentences containing conjuncts and tensed
adjuncts are broken down into a linear sequence of centering "units", while tenseless adjuncts don't generate
independent centering units.¹³ Thus (11a) and (11b) each have distinct Cb's and Cf lists.

4.2 Centering Transitions 

Table 5 shows the distribution of null and strong pronouns with respect to centering transitions, while Table 6
gives the distribution of transitions per text.

¹³The situation for complements is more complicated, and space prevents me from discussing it.

Type     Total   CONTINUE   RETAIN   SHIFT   CENT-EST   Other
zero       80       56         4       6        12        2
strong     33       13         4       5        10        1
Total     113       69         8      11        22        3

Table 5: Distribution of centering transitions
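The counts in Table 5 are easier to compare across the two pronominal forms once they are turned into proportions. The following Python sketch does only that arithmetic; the dictionary layout and the function name are choices of the sketch, and the numbers are copied from Table 5.

```python
# Counts copied from Table 5.
TABLE_5 = {
    "zero":   {"CONTINUE": 56, "RETAIN": 4, "SHIFT": 6, "CENT-EST": 12, "Other": 2},
    "strong": {"CONTINUE": 13, "RETAIN": 4, "SHIFT": 5, "CENT-EST": 10, "Other": 1},
}


def transition_shares(counts):
    """Return each transition's share of the total for one pronominal form."""
    total = sum(counts.values())
    return {transition: n / total for transition, n in counts.items()}


for form, counts in TABLE_5.items():
    shares = transition_shares(counts)
    summary = ", ".join(f"{t} {share:.0%}" for t, share in shares.items())
    print(f"{form} (n={sum(counts.values())}): {summary}")
```

On these counts CONTINUE accounts for 56 of 80 zeros but only 13 of 33 strong pronouns, which is the asymmetry that hypothesis (4a) leads one to expect.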



Text                  Type     Total   CONTINUE   RETAIN   SHIFT   CENT-EST   Other
(von Arnim, 1989)     null       36       20         4       5         6        1
                      strong     23        7         3       3         9        1
(Fallaci, 1989)       null        9        7         -       -         2        -
                      strong      2        -         -       1         1        -
(Mila, 1993)          null        4        4         -       -         -        -
                      strong      2        1         -       1         -        -
(SCI, 1994)           null        7        6         -       -         1        -
                      strong      -        -         -       -         -        -
(Nichetti, 1993)      null       13       11         -       -         2        -
                      strong      1        1         -       -         -        -
(del Buono, 1993)     null        6        4         -       1         -        1
                      strong      1        1         -       -         -        -
(Pagetti, 1993)       null        -        -         -       -         -        -
                      strong      3        2         1       -         -        -
(La Nazione, 1994)    null        5        4         -       -         1        -
                      strong      1        1         -       -         -        -
Total                           113       69         8      11        22        3

Table 6: Distribution of centering transitions per text

Tables 5 and 6 require some explanation, as they don't distinguish between SMOOTH- and ROUGH-SHIFT,
and include new transitions such as CENT-EST.

First of all, I don't distinguish between SMOOTH- and ROUGH-SHIFT, as ROUGH-SHIFT's involving pronouns
can appear only in very specific conditions, which do not occur in my data. In fact, the conditions for a
ROUGH-SHIFT are:

1. Cb(U_i) ≠ Cb(U_{i-1}) and

2. Cb(U_i) ≠ Cp(U_i)

Notice that given the Cf ranking in (2), the null or strong pronoun p_i in subject position will always be
Cp(U_i).¹⁴ Thus, for a ROUGH-SHIFT to arise, p_i must not be Cb(U_i), otherwise a SMOOTH-SHIFT would
occur. For p_i not to be Cb(U_i), U_i must have at least another pronoun (otherwise if p_i is the only pronoun,
it is Cb(U_i), and being also Cp(U_i), condition 2 does not obtain). A configuration in which a ROUGH-SHIFT
obtains in U_i is, schematically (both e_2 and e_3 are pronouns, e_3 corresponds to p_i):

U_{i-1}: Cb = e_1, Cf = [e_1 > e_2 > e_3]
U_i:     Cb = e_2, Cf = [e_3 > e_2]

A constructed example where (12b) and (12c) instantiate this configuration is:

(12a) Giorgio_i è amico di Maria_j.
      Giorgio_i is friend of Maria_j.
      Cb = ?; Cf = [Giorgio_i > Maria_j]

(12b) φ_i l_j'ha presentata_{fem} a Giovanni_k.
      (He_i) her_j has introduced_{fem} to Giovanni_k.
      Cb = Giorgio_i; Cf = [Giorgio_i > Giovanni_k > Maria_j]

(12c) Lei_j lo_k trova antipatico.
      She_j him_k finds unpleasant.
      Cb = Giovanni_k; Cf = [Maria_j > Giovanni_k]

However, there are no examples of this sort in my data.
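The two conditions for a ROUGH-SHIFT, together with the usual definitions of the other transitions, can be sketched as a small classifier. This is only an illustrative sketch: the function name classify_transition is hypothetical, entities are represented as plain strings, and an undefined Cb(U_{i-1}), as in (12a), is assumed to count as "no change of Cb".

```python
from typing import Optional, Sequence


def classify_transition(cb_prev: Optional[str], cb: Optional[str], cf: Sequence[str]) -> str:
    """Label U_i from Cb(U_{i-1}), Cb(U_i) and the ranked list Cf(U_i)."""
    cp = cf[0]  # Cp(U_i) is the highest-ranked element of Cf(U_i)
    # Assumption: an undefined Cb(U_{i-1}) is treated as "Cb unchanged".
    same_cb = cb_prev is None or cb == cb_prev
    if same_cb:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"


# Cb and Cf values copied from the annotations of (12b) and (12c).
u_12b = classify_transition(None, "Giorgio", ["Giorgio", "Giovanni", "Maria"])
u_12c = classify_transition("Giorgio", "Giovanni", ["Maria", "Giovanni"])
print(u_12b, u_12c)  # CONTINUE ROUGH-SHIFT
```

With the annotations of (12), the sketch labels (12c) as a ROUGH-SHIFT because Cb(U_i) differs both from Cb(U_{i-1}) and from Cp(U_i).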

Moving now to CENT-EST and OTHER, also included in Tables 5 and 6: CENT-EST, for center establishment,
captures the fact that sometimes pronouns (even the null subject!) can be used to refer to an entity that is
in the global focus but not on the Cf list of the previous utterance. Grosz, Joshi, and Weinstein
(1995, p. 216) also note this phenomenon:

The second case [of quasi violations of Rule 1] concerns the use of a pronoun to realize an
entity not in the Cf(U_n); such uses are strongly constrained. The particular use that have been
identified involve instances where attention is shifted globally back to a previously centered entity
(e.g. (Grosz, 1977), (Reichman, 1985)).



However, not all occurrences of CENT-EST represent a global focus shift. For example, if one postulates
that adjuncts constitute centering units in themselves, it is possible that the shift in U_i is to an entity
that belonged to Cf(U_{i-2}), where U_{i-1} is an adjunct preposed to U_i: clearly such a shift seems to have
a less dramatic effect (in terms of inference load) than a shift to an entity that had been mentioned much
earlier in the discourse.

Footnote: Empathy effects don't occur in my data when the subject is a pronoun: rather, they arise when the
subject is a full NP pertaining to a character's point of view, as in Le sue convinzioni lo trascinano fuori
dalla casetta a tutte le ore (His beliefs drag him out of his house at all hours).

Footnote: Note that (12c) also has another interpretation, a RETAIN with lo referring to Giorgio_i rather
than to Giovanni_k: in this case, though, the ROUGH-SHIFT is preferred to the RETAIN. It is clear that the
semantics of the situation comes into play.

I suspect that there may be some correlation between how global the shift is and the usage of a specific
form, in particular the usage of a full NP. Moreover, when the null subject is used for CENT-EST, the
resulting discourse may be slightly incoherent. For example, in (von Arnim, 1989, p. 70-71), the author
describes the pastor and his wife. After discussing the virtues of both, the author devotes the next 10
(complex) sentences only to the pastor; in fact, the 10th sentence doesn't talk about either of them. When
in the 11th sentence the author uses a null subject to refer to both, the effect is slightly incoherent.
Sentences 9 through 11 are reported here with a literal, but not word by word, translation.

(13a) Lui_i non parla mai di queste cose_j, ma come φ_j potrebbero rimanere nascoste?
      He_i never talks about these things_j, but how could (they_j) remain hidden?

(13b) Qui tutti_k sanno tutto prima che la giornata volga al termine, e quel che φ_l mettiamo in tavola è
      assai più d'interesse generale del più sbalorditivo capovolgimento politico.
      Here everybody_k knows everything before the day comes to an end, and what (we_l) have for dinner
      is of much more general interest than the most surprising political change.

(13c) φ_m Hanno un cottage spazioso, carino, con un bel pezzetto di terreno attiguo al cimitero.
      (They_m) have a roomy, nice cottage with a sizable piece of land next to the cemetery.

Strong pronouns are sometimes used to establish a new center by selecting a member of a set available on the
Cf list of the previous utterance, as in (14) and (15). In the utterance preceding (14), Cb = Cp = {pastore
& sua moglie}, {pastor & his wife}:

(14a) φ_i Sono entrambi di un'austera devozione.
      (They_i) are both of an austere devotion.

(14b) Lui_j lavora nella sua parrocchia con nobile dedizione, e ...
      He_j works in the his parish with noble dedication, and ...

Another such example:

(15a) φ_i Avevamo ormai finito il tè
      (We_i) had already finished the tea

(15b) e lei_j era salita di sopra a cambiarsi_j quando ...
      and she_j was gone upstairs to change herself_j when ...

It is debatable whether, once a set is available on the Cf list, its members are available as well: however,
a null subject would be infelicitous both in (14b) and in (15b), thus providing weak evidence that the
members of sets on the Cf list are not themselves available on the Cf list. I consider such usages of strong
pronouns as CENT-EST's.

Finally, OTHER refers to configurations that I have left unanalyzed for the time being: they are
characterized by parallelism, or by expressions that build a set out of Cb(U_{i-1}) and some other entity,
such as sia lui che sua moglie (both him and his wife). It is not yet clear how to deal with these
constructions within the centering framework.



5 Discussion 

The reader will recall that the reason I conducted my corpus analysis was to verify the strategies
introduced earlier, repeated here for convenience:

(16a) Typically, a null subject signals a CONTINUE, and a strong pronoun a RETAIN or a SHIFT.

(16b) A null subject can be felicitously used in cases of RETAIN or SHIFT if U_i provides syntactic features
      that force the null subject to refer to a particular referent and not to Cb(U_{i-1}). Moreover, it is
      the syntactic context up to and including the verbal form(s) carrying tense and/or agreement that
      makes the reference felicitous or not.

I will now detail the results. 

5.1 CONTINUE after RETAIN

The first part of (16a), null subjects used for CONTINUE, is strongly supported. Zeros are used 80% of
the time, and there is a significant difference (χ² = 9.204, p < 0.01) between zeros and strong pronouns
used in CONTINUE and zeros and strong pronouns used in all other transitions taken together; see the
following contingency table.

            CONTINUE   ALL OTHERS
zero              56           24
strong            13           20

Table 7: CONTINUE vs. all other transitions

Thus, in its use of null subjects for CONTINUE, Italian behaves in the same way as languages as diverse as
Japanese (Kameyama, 198q; Walker, Iida, and Cote, 1994; Shima, 1995) and Turkish (Turan, 1995), (Turan, this
volume). In fact, the usage of zeros for CONTINUE seems to be a robust cross-linguistic phenomenon.
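The statistic reported for Table 7 can be recomputed directly from the four counts. The sketch below assumes the uncorrected Pearson χ² (no continuity correction), which is the variant that reproduces the reported value; the function name pearson_chi_square is illustrative only.

```python
def pearson_chi_square(table):
    """Uncorrected Pearson chi-square for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of pronoun form and transition.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Table 7: rows = zero, strong; columns = CONTINUE, ALL OTHERS.
table7 = [[56, 24], [13, 20]]
chi2_table7 = pearson_chi_square(table7)
zero_share_in_continue = 56 / (56 + 13)
print(round(chi2_table7, 3), round(zero_share_in_continue, 2))  # 9.204 0.81
```

The share of zeros among CONTINUE's comes out at about 81%, in line with the roughly 80% quoted in the text.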

However, as the 20% of strong pronouns used for CONTINUE is not negligible, I set out to investigate which
factors may affect such a choice. I analyzed the CONTINUE's in my corpus, and I did find that one relevant
factor is the transition preceding the CONTINUE in question. Consider Table 8, which shows the different
possible transitions in the utterance U_{i-1} preceding the utterance U_i in which a CONTINUE occurs. The
configuration in which a CONTINUE is preceded by a RETAIN, which I call RET-CONT, differs from the other two
because of the constraint Cp(U_{i-1}) ≠ Cb(U_{i-1}) in the RETAIN: this in a sense predicts that the center
will shift. But if a RETAIN is followed by a CONTINUE, as in a RET-CONT, such prediction is not fulfilled.

          CONTINUE               RETAIN                 SHIFT
U_{i-1}   Cb_{i-1} = Cb_{i-2}    Cb_{i-1} = Cb_{i-2}    Cb_{i-1} ≠ Cb_{i-2}
          Cp_{i-1} = Cb_{i-1}    Cp_{i-1} ≠ Cb_{i-1}    Cp_{i-1} = Cb_{i-1}
U_i       Cb_i = Cb_{i-1} and Cp_i = Cb_i (all three columns)

Table 8: Centering transitions preceding a CONTINUE



Footnote: χ² test results are reported here more as a source of suggestive evidence than as strong
indicators, as the observations in the corpus, which come from only 8 authors, are not totally independent.

Before providing quantitative support for the distinct behavior of RET-CONT, I will illustrate this
configuration with Ex. (17), which provides two examples of RET-CONT: the first, in (17c), is realized with
a strong pronoun; the second, in (17e), is realized with a null subject. In the utterance preceding (17a),
Cb = Irais and Cf = [Irais]; the translation is literal but not word by word.

(17a) φ_i Incomincerò a ricondurre il suo_j pensiero sui suoi_j doveri chiedendole_j ogni giorno
      (I_i) will start to bring her_j thoughts back to her_j duties by asking her_j every day
      Cf: [Irais_j > Irais_j's thoughts, Irais_j's duties], Cb: Irais_j, CONTINUE

(17b) come sta suo_j marito_k.
      how her_j husband_k is.
      Cf: [husband_k > Irais_j], Cb: Irais_j, RETAIN

(17c) Non è che lei_j gli_k voglia granché bene,
      It's not the case that she_j cares much about him_k
      Cf: [Irais_j > husband_k], Cb: Irais_j, CONTINUE

(17d) perché lui_k non corre ad aprirle_j la porta
      because he_k doesn't run to open the door for her_j
      Cf: [husband_k > Irais_j], Cb: Irais_j, RETAIN

(17e) ogni volta che φ_j si alza per lasciare la stanza;
      whenever (she_j) gets up to leave the room.
      Cf: [Irais_j], Cb: Irais_j, CONTINUE
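The refinement of CONTINUE by the preceding transition can be sketched as a pass over a sequence of transition labels. The helper label_continues is a hypothetical name, and the sketch assumes that SMOOTH-SHIFT and ROUGH-SHIFT have already been collapsed into SHIFT, as in the tables of this chapter. The transition of the utterance preceding (17a) is not given in the text, so the first CONTINUE is left unrefined.

```python
def label_continues(transitions):
    """Refine each CONTINUE according to the transition of the preceding utterance."""
    prefix = {"CONTINUE": "CONT", "RETAIN": "RET", "SHIFT": "SHIFT"}
    labels = []
    for prev, cur in zip([None] + transitions[:-1], transitions):
        if cur == "CONTINUE" and prev in prefix:
            labels.append(prefix[prev] + "-CONT")
        else:
            # Non-CONTINUE transitions, or a CONTINUE with unknown predecessor.
            labels.append(cur)
    return labels


# Transitions annotated for (17a)-(17e) in the text.
ex17 = ["CONTINUE", "RETAIN", "CONTINUE", "RETAIN", "CONTINUE"]
print(label_continues(ex17))
# ['CONTINUE', 'RETAIN', 'RET-CONT', 'RETAIN', 'RET-CONT']
```

The two RET-CONT labels correspond to (17c) and (17e), the utterances singled out in the discussion of Ex. (17).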

Moving now to the quantitative analysis of RET-CONT, Table 9 shows how RET-CONT's affect the usage
of null and strong pronouns; CONT-CONT and SHIFT-CONT refer, respectively, to a CONTINUE preceded by
another CONTINUE or by a SHIFT.

Type      Total   CONT-CONT + SHIFT-CONT   RET-CONT
zero         56                       51          5
strong       13                        7          6
Total        69                       58         11

Table 9: Pronoun occurrences for RET-CONT



Compared to strong pronouns, null subjects are used 87% of the time for CONT-CONT and SHIFT-CONT
taken together and only 45% of the time for RET-CONT. Moreover, if a zero is used in a CONTINUE, that
CONTINUE is ten times more likely to be a CONT-CONT or SHIFT-CONT than a RET-CONT; in contrast, if a
strong pronoun is used in a CONTINUE, that CONTINUE is as likely to be a CONT-CONT or a SHIFT-CONT
as a RET-CONT. These trends in usage are confirmed by a strongly significant difference between zeros and
strong pronouns used in CONT-CONT plus SHIFT-CONT, and zeros and strong pronouns used in RET-CONT
(χ² = 10.910, p < 0.001). Moreover, there is a very strongly significant difference between zeros and strong
pronouns used in CONT-CONT plus SHIFT-CONT, and zeros and strong pronouns used in all other transitions,
including RET-CONT (χ² = 16.922, p < 0.001; see Table 10).

          CONT-CONT + SHIFT-CONT   RET-CONT + ALL OTHERS
zero                          51                      29
strong                         7                      26

Table 10: CONT-CONT + SHIFT-CONT vs. all other transitions

Consistently, there is no significant difference between zeros and strong pronouns used in RET-CONT and
zeros and strong pronouns used in transitions different from CONTINUE (χ² = 0.292, p < 0.7; see Table 11).
This suggests that RET-CONT's pattern more like transitions different from CONTINUE than like other
CONTINUE's; in fact, all transitions other than CONTINUE in Table 5 present a rough half-half split between
zeros and strong pronouns, as do RET-CONT's in Table 9.





          RET-CONT   ALL OTHERS (excluding CONTINUE)
zero             5                                24
strong           6                                20

Table 11: RET-CONT vs. transitions different from CONTINUE
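The proportions and statistics quoted for Tables 9 to 11 can be recomputed from the cell counts. The sketch below uses the closed-form shortcut for a 2x2 table, N(ad - bc)² / ((a+b)(c+d)(a+c)(b+d)), again assuming the uncorrected Pearson statistic; the function name chi_square_2x2 and the variable names are illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Cells are (zero_col1, zero_col2, strong_col1, strong_col2), read off the tables.
tables = {
    "Table 9": (51, 5, 7, 6),     # CONT-CONT + SHIFT-CONT vs. RET-CONT
    "Table 10": (51, 29, 7, 26),  # CONT-CONT + SHIFT-CONT vs. all other transitions
    "Table 11": (5, 24, 6, 20),   # RET-CONT vs. transitions other than CONTINUE
}
chi2 = {name: round(chi_square_2x2(*cells), 3) for name, cells in tables.items()}

zero_share_cont_shift = 51 / 58  # zeros among CONT-CONT + SHIFT-CONT
zero_share_ret_cont = 5 / 11     # zeros among RET-CONT
zero_ratio = 51 / 5              # zero CONTINUE's: non-RET-CONT vs. RET-CONT
print(chi2)
print(zero_share_cont_shift, zero_share_ret_cont, zero_ratio)
```

The three statistics come out at about 10.910, 16.922 and 0.292, matching the reported values; the two shares are roughly 0.88 and 0.45, and the ratio is 10.2, which corresponds to the "ten times more likely" in the text.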



My results on different pronominal distributions for CONT-CONT and SHIFT-CONT on the one hand, and
RET-CONT on the other, seem to be yet another source of evidence for the hypothesis that a RETAIN signals
an upcoming shift: namely, not fulfilling the prediction given by the Cp seems to require the explicit signal
provided by a strong pronoun.



Turan (1995) also independently noticed the existence of RET-CONT's, and her results are compatible
with mine: she found that in RET-CONT's, zeros decrease from 97% to 68% while strong pronouns increase
from 1% to 11% with respect to their percentages of use for CONT-CONT and SHIFT-CONT.

RETAIN, SHIFT and CENT-EST. As far as RETAIN's and SHIFT's are concerned, the numbers are too small
to draw any definitive conclusion. The tentative one is as follows: the examples I found do seem to support
(16b), as I will discuss below; namely, the null subject can be used in cases of RETAIN's and SHIFT's if
there are enough "early" clues that force the null subject to refer to a particular referent. However, the
numbers in themselves do not identify any preferred usage for strong pronouns for these transitions,
contrary to what is claimed by (16a).

CENT-EST's pattern like RETAIN and SHIFT (and RET-CONT!), in that zeros and strong pronouns appear to
be evenly distributed; moreover, there is a significant difference between zeros and strong pronouns used for
CENT-EST's and zeros and strong pronouns used for CONT-CONT plus SHIFT-CONT (χ² = 10.624, p < 0.01).

A topic for future work is to investigate which factors, if any, affect the choice between null and strong
pronouns in these configurations, especially because null subjects used for SMOOTH-SHIFT or for CENT-EST
sometimes result in a slightly less coherent discourse; see (13).



5.2 Verb agreement, clitics, and null / strong pronouns 



The second part of my claim, (16b), namely, that a null subject can be used if U_i provides syntactic clues
that force the null subject not to refer to Cb(U_{i-1}), is indeed borne out; however, given the small number
of occurrences of null subjects encoding these transitions (four RETAIN's and six SHIFT's), this conclusion
can only be tentative. The most frequent clue is agreement in gender and/or number; in some examples,
clitics are useful for disambiguation as well, but I found no example of the clitic climbing discussed
earlier.
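The role of agreement in (16b) can be illustrated with a toy check: a null subject is taken to be usable for a RETAIN or SHIFT when the agreement features alone single out a referent other than Cb(U_{i-1}). Everything in this sketch is an assumption made for illustration: the function names, the feature dictionaries and the two candidate entities are hypothetical and are not corpus data, and the "early enough" condition on where the clue occurs in the clause is not modelled.

```python
def forced_referent(agreement, candidates):
    """Return the only candidate compatible with the agreement features, else None."""
    matches = [
        name
        for name, features in candidates.items()
        if all(features.get(key) == value for key, value in agreement.items())
    ]
    return matches[0] if len(matches) == 1 else None


def null_subject_licensed(agreement, candidates, cb_prev):
    """True if agreement alone forces a referent different from Cb(U_{i-1})."""
    referent = forced_referent(agreement, candidates)
    return referent is not None and referent != cb_prev


# Hypothetical Cf(U_{i-1}) with toy gender/number features.
cf_prev = {
    "Giorgio": {"gender": "masc", "number": "sg"},
    "Maria": {"gender": "fem", "number": "sg"},
}
feminine_clue = null_subject_licensed({"gender": "fem", "number": "sg"}, cf_prev, "Giorgio")
number_only = null_subject_licensed({"number": "sg"}, cf_prev, "Giorgio")
print(feminine_clue, number_only)  # True False
```

In the first call the feminine singular features pick out a single referent that is not the previous Cb, while in the second call number alone leaves both candidates open, so no referent is forced.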

Footnote: The only researcher I know of who argues against the prediction associated with a RETAIN is
Linson: in (1993), he presents evidence based on a corpus study, in which a RETAIN is followed by a CONTINUE
50% of the time and by a SHIFT only 15% of the time. However, Turan notices (p.c.) that Linson used the Cf
ranking in (1), and that if (1) is amended to take "empathy" into account, Linson's results may be
different, in that certain RETAIN's may in fact be CONTINUE's.

However, I hoped to be able to verify a stronger claim, that whenever such clues are available a null
subject is used. But the data only partly support this stronger claim. In fact, of the 9 instances of strong
pronouns realizing a RETAIN or SHIFT, 4 do have clues that should make a null subject possible. Two of the
four examples, both RETAIN's, are:

(18a) Io_i faccio visita a lei_j una volta all'anno,
      I_i pay visit to her_j one time per year,

(18b) e lei_j mi_i ricambia la visita quindici giorni dopo.
      and she_j to me_i returns the visit fifteen days later.

(19a) φ_i È pronta a difenderlo_j in ogni occasione contro di noi_k.
      (She_i) is ready to defend him_j in every occasion against of us_k.

(19b) Lui_j non le_i parla mai.
      He_j not to her_i talks ever.

Both (18b) and (19b) have a clitic available before the main verb, analogously to an example discussed
earlier: thus, substituting the strong pronoun with a zero should result in a coherent discourse, but this
is not the case. Turan (1995) notices that in Turkish the rule that prescribes using zeros in a CONTINUE is
overridden if the pronominal expression has to carry additional pragmatic information, such as phonetic
prominence or a listing reading. In the case of (18), clearly parallelism comes into play. In (19b), there
is indeed a contrast between lei (a female guest) defending lui (the author's husband), and lui trying to
ignore lei as much as possible.

In contrast to (18b) and (19b), I would like to mention two examples, again both RETAIN's, that pattern
like the earlier example in which the clitic is found "too late" to allow the usage of a null subject.

The first example can be found in (17d) above. The clitic le (for her) is cliticized to the infinitive
aprire (to open), which is an adjunct to the main verb corre (runs). The second example is (20d): the clitic
le (to her) doesn't climb in front of the modal vuole (wants), but is cliticized to the lower verb correr(e)
(to run).

(20a) Ma lui_k doveva sposare la cuoca_{j,fem},
      But he_k wanted to marry the cook_{j,fem},

(20b) e la cuoca_j ha visto un fantasma
      and the cook_j has seen a ghost

(20c) e φ_j se_j n'è andata_{fem} su due piedi,
      and self_j is gone on two feet,

(20d) e lui_k vuole correrle_j dietro ...
      and he_k wants to run to her_j after ...

6 Conclusions 

The work presented in this chapter aims at explaining the different usages of Italian pronominal subjects
in terms of centering transitions. The current research follows up on and extends (Di Eugenio, 1990). My
goal was to test the claims made in my earlier work, which were based on constructed examples, against
naturally occurring data. Not surprisingly, conducting a corpus analysis was useful not just to verify those
hypotheses but also to extend the analysis in a variety of ways. The hypothesized strong preference for null
subjects in the case of CONTINUE is verified. Furthermore, taking the transition preceding a CONTINUE into
account provides an elegant explanation for about half of the strong pronouns used in CONTINUE's: a CONTINUE
preceded by a RETAIN behaves differently from one preceded by a CONTINUE or by a SHIFT.

The results regarding the usage of strong pronouns for RETAIN and SHIFT are mixed: in fact, the numbers
don't indicate any preference for one pronominal form over the other. Somewhat to my surprise, I found
that what is supported, at least tentatively given the small numbers, is the second part of my claim, (16b):
a null subject can be used for RETAIN or SHIFT if the context up to and including the verbal forms marked
for tense and agreement provides "early enough" clues that prevent pronominal interpretation garden paths.

It is clear that it is necessary to refine the analysis, first of all by collecting more instances of
RETAIN's and SHIFT's, and of CONTINUE's occurring after RETAIN. Moreover, other pragmatic factors, such as
parallelism and contrast, should be examined, in order to understand how they affect the choice of referring
expressions.

I think a fruitful direction in which to move is to study the functions of referential full NP's in terms of
centering transitions. Some preliminary results, available in (Di Eugenio, 1996), show that the percentage
of CONTINUE's realized by means of full NP's is not negligible at all, as it amounts to 16%; and that full
NP's account for the majority of CENT-EST's: the preference for full NP's over other referring expressions
for CENT-EST is statistically significant. If CENT-EST's do correspond, at least in part, to shifts in global
focus, as mentioned earlier, an issue to tease apart concerns the conditions under which full NP's, strong
pronouns and zeros are used. In general, by including full NP's in the analysis, a more complete account of
the choices of referring expressions will be possible.